Abstract: We derive the maximum entropy of a flow (information
utility) which conforms to traffic constraints imposed by
a generalized token bucket regulator, by taking into account the
covert information present in the randomness of packet lengths.
Under equality constraints of aggregate tokens and aggregate
bucket depth, a generalized token bucket regulator can achieve
higher information utility than a standard token bucket regulator.
The optimal generalized token bucket regulator has a near-uniform
bucket depth sequence and a decreasing token increment
sequence.

Index Terms: network information theory, token bucket traffic
regulation, packet length schedule, quality of service

I. Introduction 

In Internet Quality of Service (QoS) parlance, as a part 
of the service level agreement (SLA) between a subscriber 
(source) and an Internet service provider (ISP), a token bucket 
regulator (TBR) can be used to smooth the bursty nature of a
subscriber's traffic [1]. The SLA mandates that the ISP provide 
end-to-end loss and delay guarantees to a subscriber's packets, 
provided the traffic profile of the subscriber adheres to certain 
TBR constraints. The standard token bucket regulator (STBR), 
as defined by the Internet Engineering Task Force (IETF), 
enforces linear-boundedness on the flow and is characterized 
by the token increment rate r and the bucket depth B. We 
will be more general and consider a TBR in which the token 
increment rate and bucket depth (maximum burst size) can 
vary from slot to slot. Such a TBR, which we define as a 
generalized token bucket regulator (GTBR), can be used to 
regulate variable bit rate (VBR) traffic^1 from a source [2].
The continuous-time analogue of a GTBR is the time-varying 
leaky bucket shaper [3] in which the token rate and bucket 
depth parameters can change at specified time instants. In [3], 
the authors determine the optimal parameters (rates and bucket 
sizes) and apply them to the renegotiable VBR service.

Our primary contribution is developing the notion of
information utility of a GTBR. Specifically, we derive the
maximum information that a GTBR-conforming traffic flow
can convey in a finite time interval, by taking into account the
additional information present in the randomness of packet
lengths. The idea of using a covert channel to convey side
information^2 in data networks has been investigated earlier in
the classic papers [4], [5]. In this paper, the side information
is considered in the lengths of the packets only. Of all the
packet length schedules that conform to a given GTBR, our
objective is to stochastically characterize the flow that has the
maximum entropy.

^1 For example, a pre-recorded video stream.

^2 Information present in packets other than the actual packet contents.

In [6], the authors have derived the information utility of an 
STBR and suggested a pricing viewpoint for its application. 
Our interest is more theoretical - we consider an STBR as a 
special case of a GTBR and describe a framework for their 
information-theoretic comparison. We investigate whether a 
GTBR can achieve higher flow entropy than an STBR and 
explain the properties of entropy-maximizing GTBRs. 

Section II explains our system model. In Section III, we
derive the optimal flow entropy equation and define the
information utility of a GTBR. In Section IV, we formulate the
optimal GTBR problem and derive a necessary condition. In Section
V, we compute the optimal GTBR. We interpret our results in
Section VI and conclude in Section VII.

II. System model

Fig. 1. Relative time instants of the parameters defined in (1).

Consider a system in which time is divided into slots and a
source which has to complete its data transmission within $N$
slots. In our discrete-time model, we will evaluate the system
at time instants $0, 1, \ldots, N-1, N$. The $k$-th slot is defined to
be the time interval $[k, k+1)$. The traffic from the source is
regulated by a GTBR. Define

$$
\begin{aligned}
r_k &:= \text{token increment for the } k\text{-th slot} \\
B_k &:= \text{bucket depth for the } (k+1)\text{-th slot} \\
\ell_k &:= \text{length of packet transmitted in the } k\text{-th slot} \\
u_k &:= \text{residual tokens at start of the } k\text{-th slot}
\end{aligned}
\tag{1}
$$

$r_k$, $B_k$, $\ell_k$ and $u_k$, whose relative time instants are shown
in Figure 1, are all non-negative integers. Let
$\mathbf{r} := (r_0, r_1, \ldots, r_{N-1})$ denote the token increment sequence and
$\mathbf{B} := (B_0, B_1, \ldots, B_{N-2})$ denote the bucket depth sequence.
The system starts with zero tokens; $u_0 = 0$. A GTBR $\mathcal{R}$ with
the above parameters, written as $\mathcal{R}(N, \mathbf{r}, \mathbf{B})$, constrains the
packet lengths according to

$$\ell_i \le u_i + r_i \quad \forall\, i : 0 \le i \le N-1 \tag{2}$$

If (2) is satisfied, then $\boldsymbol{\ell} = (\ell_0, \ell_1, \ldots, \ell_{N-1})$ is a conforming
packet length vector and $u_i$ evolves as

$$
\begin{aligned}
u_{i+1} &= \min(u_i + r_i - \ell_i,\, B_i) \quad \forall\, i : 0 \le i \le N-2 \\
u_N &= u_{N-1} + r_{N-1} - \ell_{N-1}
\end{aligned}
\tag{3}
$$

If $r_i = r$ and $B_i = B$ for all $i$, then the GTBR $\mathcal{R}_g(N, \mathbf{r}, \mathbf{B})$
degenerates to the STBR $\mathcal{R}_s(N, r, B)$.
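The conformance rule (2) and the token update (3) can be traced with a short Python sketch. It is only an illustration of the model above; the function name `evolve_tokens` and its error handling are assumptions made for this example and do not come from the paper.

```python
from typing import List, Sequence


def evolve_tokens(r: Sequence[int], B: Sequence[int],
                  lengths: Sequence[int]) -> List[int]:
    """Return [u_0, ..., u_N] for a conforming schedule, else raise ValueError.

    r has N token increments, B has N - 1 bucket depths and lengths has
    N packet lengths (hypothetical argument layout for this sketch).
    """
    N = len(r)
    if len(B) != N - 1 or len(lengths) != N:
        raise ValueError("expected len(B) == N - 1 and len(lengths) == N")
    u = [0]  # the system starts with zero tokens
    for i in range(N):
        available = u[i] + r[i]
        # Constraint (2): a packet may use at most the available tokens.
        if not 0 <= lengths[i] <= available:
            raise ValueError(
                f"slot {i}: length {lengths[i]} is outside [0, {available}]")
        leftover = available - lengths[i]
        # Update (3): clamp to the bucket depth in every slot but the last.
        u.append(min(leftover, B[i]) if i < N - 1 else leftover)
    return u


# STBR with N = 4, r = 3, B = 6: stay silent for three slots, then send 9 bits.
print(evolve_tokens([3, 3, 3, 3], [6, 6, 6], [0, 0, 0, 9]))  # [0, 3, 6, 6, 0]
```

In the example, the bucket depth clamps the residual tokens at 6 after the third slot, so the last packet can carry at most 9 bits even though 12 tokens were issued in total.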

III. Information utility 

Consider a source which has a large amount of data to
send and whose traffic is regulated by a GTBR. We seek to
maximize the information that the source can convey in the
given time interval, i.e., the entropy present in the source traffic
flow in an information-theoretic sense. The maximum entropy
achievable by any flow which is constrained by the GTBR
$\mathcal{R}(N, \mathbf{r}, \mathbf{B})$ is defined to be its information utility. The source
can send information to the destination via two channels:

i) Overt channel: The contents of each packet. Let $\ell_i$ be
the length of a packet in bits. The value of each bit is
0 or 1 with equal probability and is independent of the
values taken by the preceding and succeeding bits. The
packet thus contributes $\ell_i$ bits of information.

ii) Covert channel: We consider the length of a packet
as an event and associate a probability with it. Thus,
side information is transmitted by the randomness in the
packet lengths.

At time $k$, the only method by which past transmissions can
constrain the rest of the flow is by the residual number of
tokens $u_k$. The key observation is that the future entropy
depends only on the token level $u_k$ at time $k$. So, $u_k$ captures
the state of the system. Entropy is a function of system state
$u_k$ and is denoted by $H_k(u_k)$.

At time $N$, the source signals the termination of the
current flow by transmitting a special string of bits (flag). The
information transmitted by this fixed sequence of bits is zero.

$$\therefore\; H_N(u_N) = 0 \tag{4}$$

For a given state $u_k$ of the system, if a packet of length $\ell_k$
bits is transmitted with probability $p_{\ell_k}(u_k)$, then:

1) The overt information transmitted is $\ell_k$ bits.

2) As the event occurs with probability $p_{\ell_k}(u_k)$, the covert
information transmitted is $-\log_2 p_{\ell_k}(u_k)$ bits.

3) Since $\ell_k$ is random, $u_{k+1}$ is also random (from (3)).
Thus, $H_{k+1}(u_{k+1})$ is also a random variable.

Adding all of the above and averaging over all conforming
packet lengths, we obtain the entropy of the current stage:

$$H_k(u_k) = \sum_{\ell_k=0}^{u_k+r_k} p_{\ell_k}(u_k)\Big[\ell_k - \log_2\big(p_{\ell_k}(u_k)\big) + H_{k+1}\big(\min(u_k + r_k - \ell_k,\, B_k)\big)\Big] \quad \forall\, k = 0, \ldots, N-1 \tag{5}$$

Finally, the above probabilities must satisfy

$$\sum_{\ell_k=0}^{u_k+r_k} p_{\ell_k}(u_k) = 1 \quad \forall\, k = 0, \ldots, N-1 \tag{6}$$

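Let $\mathbf{p}_k(u_k) := (p_0(u_k), p_1(u_k), \ldots, p_{u_k+r_k}(u_k))$. Our
objective is to determine the sequence of probability mass
functions^3 $(\mathbf{p}_{N-1}^*, \mathbf{p}_{N-2}^*, \ldots, \mathbf{p}_0^*)$ which maximizes the flow
entropy $H_0(0)$ for a given GTBR $\mathcal{R}(N, \mathbf{r}, \mathbf{B})$. From (4),

$$H_N^*(u_N) = 0$$

From (5),

$$H_k(u_k) = \sum_{\ell_k=0}^{u_k+r_k} p_{\ell_k}\Big[\ell_k - \log_2(p_{\ell_k}) + H_{k+1}\big(\min(u_k + r_k - \ell_k,\, B_k)\big)\Big] \quad \forall\, k = 0, \ldots, N-1$$

Given $H_{k+1}^*(u_{k+1})\ \forall\, u_{k+1}$, there exists an optimum
probability vector $\mathbf{p}_k^* = (p_0^*, p_1^*, \ldots, p_{u_k+r_k}^*)$ which maximizes the
flow entropy $H_k(u_k)$.

$$\therefore\; H_k^*(u_k) = \sum_{\ell_k=0}^{u_k+r_k} p_{\ell_k}^*\Big[\ell_k - \log_2(p_{\ell_k}^*) + H_{k+1}^*\big(\min(u_k + r_k - \ell_k,\, B_k)\big)\Big] \quad \forall\, k = 0, \ldots, N-1 \tag{7}$$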

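Thus, the problem of computing the entire sequence of
probability vectors $(\mathbf{p}_{N-1}^*, \mathbf{p}_{N-2}^*, \ldots, \mathbf{p}_0^*)$ has now been decoupled
into a sequence of subproblems. The subproblem for time $k$
is:

Given the function $H_{k+1}^*(u_{k+1})\ \forall\, u_{k+1}$, determine the
probability vector $\mathbf{p}_k = (p_0, p_1, \ldots, p_{u_k+r_k})$ so as to

$$\text{maximize } \sum_{\ell_k=0}^{u_k+r_k} p_{\ell_k}\Big[\ell_k - \log_2(p_{\ell_k}) + H_{k+1}^*\big(\min(u_k + r_k - \ell_k,\, B_k)\big)\Big] \quad \text{subject to } \sum_{\ell_k=0}^{u_k+r_k} p_{\ell_k} = 1 \tag{8}$$

(8) can be solved using Lagrange multipliers.

$$\mathcal{L}(\mathbf{p}_k, \lambda_k) = \sum_{\ell_k=0}^{u_k+r_k} p_{\ell_k}\Big(\ell_k - \log_2(p_{\ell_k}) + H_{k+1}^*\big(\min(u_k + r_k - \ell_k,\, B_k)\big)\Big) + \lambda_k\Big(\sum_{\ell_k=0}^{u_k+r_k} p_{\ell_k} - 1\Big) \tag{9}$$

^3 The dependence of $p_{\ell_k}$ and $\mathbf{p}_k$ on $u_k$ is assumed to be understood and
is not always stated explicitly. So, $\mathbf{p}_k = (p_0, p_1, \ldots, p_{u_k+r_k})$.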
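At the optimal point $(\mathbf{p}_k^*, \lambda_k^*)$,

$$\left.\frac{\partial \mathcal{L}}{\partial p_{\ell_k}}\right|_{(\mathbf{p}_k^*, \lambda_k^*)} = 0 \quad \forall\, \ell_k = 0, \ldots, u_k + r_k \tag{10}$$

$$\left.\frac{\partial \mathcal{L}}{\partial \lambda_k}\right|_{(\mathbf{p}_k^*, \lambda_k^*)} = 0 \tag{11}$$

Solving (11),

$$\sum_{\ell_k=0}^{u_k+r_k} p_{\ell_k}^* = 1 \tag{12}$$

Solving (10),

$$p_{\ell_k}^* = 2^{\,\ell_k + H_{k+1}^*(\min(u_k + r_k - \ell_k,\, B_k)) + \lambda_k^* - \log_2 e} \tag{13}$$

From (12) and (13),

$$2^{\,\lambda_k^* - \log_2 e} = \frac{1}{\sum_{a_k=0}^{u_k+r_k} 2^{\,a_k + H_{k+1}^*(\min(u_k + r_k - a_k,\, B_k))}} \tag{14}$$

From (13) and (14),

$$p_{\ell_k}^*(u_k) = \frac{2^{\,\ell_k + H_{k+1}^*(\min(u_k + r_k - \ell_k,\, B_k))}}{\sum_{a_k=0}^{u_k+r_k} 2^{\,a_k + H_{k+1}^*(\min(u_k + r_k - a_k,\, B_k))}} \tag{15}$$

From (7) and (15), we finally obtain

$$H_k^*(u_k) = \log_2\left(\sum_{\ell_k=0}^{u_k+r_k} 2^{\,\ell_k + H_{k+1}^*(\min(u_k + r_k - \ell_k,\, B_k))}\right) \tag{16}$$

Starting with $H_N^*(u_N) = 0$, we use (16) to compute the
optimal flow entropy $H_k^*(u_k)$ for all $u_k$, proceeding
backward recursively for $k = N-1, N-2, \ldots, 0$. The
information utility of the GTBR is $H_0^*(0)$.

IV. Problem Formulation

For the information-theoretic comparison of a GTBR
$\mathcal{R}_g(N, \mathbf{r}, \mathbf{B})$ and an STBR $\mathcal{R}_s(N', r, B)$, we impose the
following conditions:

a) $\mathcal{R}_g$ and $\mathcal{R}_s$ must operate over the same number of slots.

$$N = N'$$

b) The aggregate tokens of $\mathcal{R}_g$ and $\mathcal{R}_s$ must be equal.

$$\sum_{i=0}^{N-1} r_i = Nr \tag{17}$$

c) The aggregate bucket depth of $\mathcal{R}_g$ must not exceed that
of $\mathcal{R}_s$^4.

$$\sum_{i=0}^{N-2} B_i \le (N-1)B \tag{18}$$

d) The bucket depth of $\mathcal{R}_s$ cannot be very high compared
to its token increment rate.

$$2r \le B \le 5r \tag{19}$$

For example, in [3], the authors use $r_{max}$ = 6 Mbps
and $B_{max}$ = 12 Mbps for their simulations.

e) The token increment rate of $\mathcal{R}_g$ at every stage must not
be higher than the bucket depth of $\mathcal{R}_g$.

$$r_i \le B_i \quad \forall\, i \tag{20}$$

^4 Equality is present in (17) because every additional token directly
translates to the permission to transmit one more bit, leading to an increase in
information utility. As this may not be necessarily true for bucket depth,
we permit inequality in (18).

The optimal GTBR problem is:

Given an STBR $\mathcal{R}_s(N, r, B)$, determine $\mathbf{r}$ and $\mathbf{B}$ of a GTBR
$\mathcal{R}_g(N, \mathbf{r}, \mathbf{B})$ so as to maximize $H_0^*(0)$ subject to (17), (18),
(19) and (20).

The following result significantly reduces the search
space for the optimal GTBR.

Proposition: For an optimal GTBR, equality must hold
in (18), except when $N$ is small.

Proof: We prove by contradiction. Define $g_k(u) := 2^{H_k^*(u)}$.
Since $H_k^*(u) \ge 0$, $g_k(u) \ge 1$. From (16),

$$g_k(u) = \sum_{\ell=0}^{u+r_k} 2^{\ell}\, g_{k+1}\big(\min(u + r_k - \ell,\, B_k)\big) \tag{21}$$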



$g_{N-1}(u) = 2^{u + r_{N-1} + 1} - 1$ is an increasing sequence in $u$.
Using (21), we can show that $g_k(u)$ is an increasing sequence
in $u$ $\forall\, k = 0, \ldots, N-1$. Let $\phi_i$ = maximum number of
tokens possible at time $i$. Thus, $\phi_0 = 0$ and

$$\phi_i = \min(\phi_{i-1} + r_{i-1},\, B_{i-1}) \quad \forall\, i = 1, \ldots, N-1 \tag{22}$$

If $u_i \le \phi_i$, then we say that state $u_i$ is reachable at stage $i$,
otherwise it is unreachable.

Let $\mathcal{R}(N, \mathbf{r}, \mathbf{B})$ be an optimal GTBR, for which equality
does not hold in (18). Then $\sum_{i=0}^{N-2} B_i \le (N-1)B - 1$.
Consider another GTBR $\mathcal{R}'(N, \mathbf{r}', \mathbf{B}')$ with $\mathbf{r}' = \mathbf{r}$ and
$\mathbf{B}' = (B_0, \ldots, B_{k-1}, B_k + 1, B_{k+1}, \ldots, B_{N-2})$ for some $k$.
$\mathbf{B}'$ satisfies (18). $g'_j(u) = g_j(u)$ $\forall\, j = k+1, \ldots, N$ and $\forall\, u$.
Since $\min(u + r_k - \ell, B_k + 1) \ge \min(u + r_k - \ell, B_k)$,
$g_{k+1}(\min(u + r_k - \ell, B_k + 1)) \ge g_{k+1}(\min(u + r_k - \ell, B_k)) \ge 1$.
If we determine a reachable state $u$ such that $g'_k(u) > g_k(u)$,
then $g'_0(0) > g_0(0)$, since the flow entropy at stage 0 is
computed stage-by-stage as a linear sum of future possible
flow entropies with positive weights. Thus, the problem now
reduces to determining a stage $k$ and a reachable state $u$ such
that $g'_k(u) > g_k(u)$. One of the following must hold:

Case 1: There exists an $i \in \{1, \ldots, N-1\}$ such that
$\phi_i = B_{i-1} < \phi_{i-1} + r_{i-1}$.

Case 2: There is no $i$ such that $\phi_i = B_{i-1} < \phi_{i-1} + r_{i-1}$.
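Case 1: Consider the smallest $i$ such that $\phi_i = B_{i-1} < \phi_{i-1} + r_{i-1}$.
Take $k = i - 1$. From (21),

$$g_{i-1}(u) = \sum_{\ell=0}^{u + r_{i-1} - B_{i-1}} 2^{\ell}\, g_i(B_{i-1}) + \sum_{\ell = u + r_{i-1} - B_{i-1} + 1}^{u + r_{i-1}} 2^{\ell}\, g_i(u + r_{i-1} - \ell) \tag{23}$$

$$g'_{i-1}(u) = \sum_{\ell=0}^{u + r_{i-1} - B_{i-1} - 1} 2^{\ell}\, g_i(B_{i-1} + 1) + \sum_{\ell = u + r_{i-1} - B_{i-1}}^{u + r_{i-1}} 2^{\ell}\, g_i(u + r_{i-1} - \ell) \tag{24}$$

(23) and (24) hold only if

$$u + r_{i-1} - B_{i-1} - 1 \ge 0 \tag{25}$$

$u = \phi_{i-1}$ is a state which is reachable in the original system as
well as in the primed system and satisfies (25). Since $g_i(u)$ is
an increasing sequence in $u$, (23) and (24) imply
$g'_{i-1}(\phi_{i-1}) > g_{i-1}(\phi_{i-1})$. Consequently, $g'_0(0) > g_0(0)$.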
Case 2: If no such $i$ exists, then $B_i \ge r_0 + \cdots + r_i$ $\forall\, i = 0, \ldots, N-2$.
Adding and using (17),

$$\sum_{i=0}^{N-2} B_i \ge (Nr - r_{N-1}) + (Nr - r_{N-1} - r_{N-2}) + \cdots \tag{26}$$

$$\ge (Nr - B) + (Nr - 2B) + \cdots = N(N-1)r - aB \tag{27}$$

From (17), (19) and (20), we cannot have $r_i = B$ $\forall\, i$. So,
$a$ cannot be of the order of $N^2$. Thus, the lower bound on
$\sum_{i=0}^{N-2} B_i$ given by (26) and (27) is a loose lower bound.
From (18), (19) and (27), $\sum_{i=0}^{N-2} B_i$ grows as $N^2$ and is
upper-bounded by $5(N-1)r$, which is impossible, except when $N$
is small. So, we discard Case 2.

From the result of Case 1, $H_0^{*\prime}(0) > H_0^*(0)$. So, our
assumption that $\mathcal{R}$ is an optimal GTBR is incorrect. Therefore,
equality must hold in (18) for every optimal GTBR. ■
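The optimization problem of this section can be explored by brute force on very small instances. The Python sketch below is an illustration under stated assumptions, not the authors' search procedure: the names `information_utility`, `compositions` and `search` are hypothetical, constraint (20) is read as $r_i \le B_i$ for $i = 0, \ldots, N-2$ (the last slot has no bucket depth of its own), and (18) is kept as an inequality so that the proposition is not presupposed. The utility is evaluated through $g_k(u) = 2^{H_k^*(u)}$ and recursion (21) in exact integer arithmetic.

```python
from functools import lru_cache
from itertools import product
from math import log2


def information_utility(r, B):
    """Return H_0^*(0) in bits for token increments r and bucket depths B."""
    N = len(r)

    @lru_cache(maxsize=None)
    def g(k, u):
        # g_k(u) = 2^{H_k^*(u)}; H_N^* = 0 gives g_N = 1.
        if k == N:
            return 1
        total = 0
        for length in range(u + r[k] + 1):  # conforming lengths, as in (2)
            left = u + r[k] - length
            nxt = min(left, B[k]) if k < N - 1 else left  # token update (3)
            total += (1 << length) * g(k + 1, nxt)
        return total

    return log2(g(0, 0))


def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def search(N, r, B):
    """Maximise the utility subject to (17), (18) and the assumed r_i <= B_i."""
    best_h, best_r, best_b = float("-inf"), None, None
    budget = (N - 1) * B
    for depths in product(range(budget + 1), repeat=N - 1):
        if sum(depths) > budget:  # aggregate bucket depth (18)
            continue
        for incs in compositions(N * r, N):  # aggregate tokens (17)
            if any(incs[i] > depths[i] for i in range(N - 1)):
                continue
            h = information_utility(incs, depths)
            if h > best_h:
                best_h, best_r, best_b = h, incs, depths
    return best_h, best_r, best_b


if __name__ == "__main__":
    # A toy instance (N = 3, r = 2, B = 4), chosen only to keep the search tiny.
    print(search(3, 2, 4))
```

The search space grows quickly with $N$, $r$ and $B$, which is why a reduction such as the proposition above matters for larger instances.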



V. Optimal GTBR 

We determined the optimal GTBR by exhaustive search over
the reduced search space obtained from the proposition. Our
computation results are shown in Table I. $H_s$ and $H_g^*$ denote
the information utility of the STBR $\mathcal{R}_s(N, r, B)$ and the
optimal GTBR $\mathcal{R}_g(N, \mathbf{r}^*, \mathbf{B}^*)$ respectively. Based on our
computations, we infer:

1) A generalized token bucket regulator can achieve higher
information utility than a standard token bucket regulator.
The increase in information utility is significant (up
to 7.2%), especially for higher values of $B$.



(N, r, B)    r*              B*               H_s (bits)   H_g* (bits)   inc. (%)
(4, 3, 6)    (6 3 3 0)       (6 6 6)          20.04        20.92         4.4
(4, 3, 9)    (8 3 1 0)       (8 10 9)         20.10        21.44         6.7
             (9 2 1 0)       (9 10 8)
(4, 3, 12)   (12 0 0 0)      (12 12 12)       20.10        21.56         7.2
(4, 4, 8)    (8 4 4 0)       (8 8 8)          25.08        26.04         3.8
(4, 4, 10)   (9 5 2 0)       (9 12 9)         25.13        26.39         5.0
(4, 4, 12)   (11 4 1 0)      (11 14 11)       25.14        26.59         5.8
(4, 4, 16)   (16 0 0 0)      (16 16 16)       25.14        26.70         6.2
(4, 5, 10)   (10 5 5 0)      (10 10 10)       29.91        30.92         3.4
(4, 5, 12)   (11 6 3 0)      (11 14 11)       29.96        31.24         4.3
(4, 6, 12)   (11 7 6 0)      (11 13 12)       34.60        35.66         3.1
             (12 7 5 0)      (12 13 11)
(5, 3, 6)    (6 3 3 3 0)     (6 6 6 6)        25.68        26.57         3.5
(5, 3, 9)    (8 3 3 1 0)     (8 10 10 8)      25.88        27.33         5.6
(5, 3, 12)   (11 2 2 0 0)    (11 13 13 11)    25.90        27.59         6.5
(5, 3, 15)   (15 0 0 0 0)    (15 15 15 15)    25.90        27.64         6.7
(6, 3, 6)    (6 3 3 3 3 0)   (6 6 6 6 6)      31.33        32.23         2.9

TABLE I

Entropy-maximizing GTBR for given N, r and B.



2) The optimal bucket depth sequence $\mathbf{B}^*$ is uniform or
near-uniform, i.e., the standard deviation is very small
compared to the mean.

3) The optimal token increment sequence $\mathbf{r}^*$ is a decreasing
sequence and is not uniform.

4) For a fixed $N$ and $r$:

a) If $B = 2r$, $\mathbf{B}^*$ is always uniform and $\mathbf{r}^*$ is uniform
except for the terminal values.

b) As $B$ increases from $2r$ to $\min(5, N)r$, the variance
of $\mathbf{r}^*$ increases rapidly with a concentration
of tokens in the first few stages, the variance of $\mathbf{B}^*$
increases slowly, while $H_g^*$ initially increases and
then saturates at some final value. $H_g^*$ is an
increasing and concave sequence^5 in $B$ (Figure 2).

5) For a fixed $N$ and $B$, $H_g^*$ is an increasing, highly linear
and slightly concave sequence in $r$ (Figure 3). For the
STBR, Results 4b and 5 have been observed in [6].
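Individual entries of Table I can be spot-checked with the backward recursion (16). The following Python sketch is a minimal illustration rather than the code used to produce the table; the function name `information_utility` is an assumption, and the recursion is carried out on $2^{H_k^*(u)}$ so that all intermediate values are exact integers.

```python
from functools import lru_cache
from math import log2


def information_utility(r, B):
    """Return H_0^*(0) in bits for token increments r and bucket depths B."""
    N = len(r)

    @lru_cache(maxsize=None)
    def g(k, u):
        # g(k, u) = 2^{H_k^*(u)}; the terminal condition (4) gives g(N, u) = 1.
        if k == N:
            return 1
        total = 0
        for length in range(u + r[k] + 1):  # all conforming packet lengths
            left = u + r[k] - length
            nxt = min(left, B[k]) if k < N - 1 else left  # token update (3)
            total += (1 << length) * g(k + 1, nxt)
        return total

    return log2(g(0, 0))


# Row (N, r, B) = (4, 3, 6) of Table I.
h_s = information_utility((3, 3, 3, 3), (6, 6, 6))  # STBR
h_g = information_utility((6, 3, 3, 0), (6, 6, 6))  # tabulated optimal GTBR
print(f"H_s = {h_s:.2f} bits, H_g* = {h_g:.2f} bits")
# H_s = 20.04 bits, H_g* = 20.92 bits
```

Working with integers avoids accumulating floating-point error across stages; only the final logarithm is taken in floating point.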



VI. Information-theoretic interpretation 

From classical information theory, if $\sum_{i=1}^{n} p_i = 1$, system
entropy $H$ increases with decreasing Kullback-Leibler
distance between the given probability mass function (pmf) and
the uniform pmf. $H$ is maximized only if $p_1 = \cdots = p_n = \frac{1}{n}$.
Also, maximum system entropy $H^*$ increases with $n$ [7].
Analogously, a GTBR can achieve higher information utility
than an STBR because the pmfs of the packet lengths at each
stage are more uniform and have a larger support. For a given
$\mathbf{r}$ and $\mathbf{B}$, recall that information utility is computed recursively
by (4) and (16).
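The role of the support size can be illustrated numerically for the first slot. The Python sketch below evaluates the optimal pmf (15) at $k = 0$, $u_0 = 0$ for the STBR $(N, r, B) = (4, 3, 6)$ and for the corresponding GTBR listed in Table I, and reports the support size and the entropy of the first packet length alone. The helper names `stage0_length_pmf` and `entropy_bits` are assumptions made for this example, and the sketch assumes $N \ge 2$; it looks only at stage 0 and is not a substitute for the argument in the text.

```python
from functools import lru_cache
from math import log2


def stage0_length_pmf(r, B):
    """Optimal pmf of the first packet length, from (15) with k = 0, u_0 = 0."""
    N = len(r)  # this sketch assumes N >= 2 so that B[0] exists

    @lru_cache(maxsize=None)
    def g(k, u):
        # g(k, u) = 2^{H_k^*(u)}, computed by the backward recursion.
        if k == N:
            return 1
        total = 0
        for length in range(u + r[k] + 1):
            left = u + r[k] - length
            nxt = min(left, B[k]) if k < N - 1 else left
            total += (1 << length) * g(k + 1, nxt)
        return total

    # Unnormalised weights 2^{l + H_1^*(min(r_0 - l, B_0))} for l = 0..r_0.
    weights = [(1 << length) * g(1, min(r[0] - length, B[0]))
               for length in range(r[0] + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def entropy_bits(pmf):
    """Shannon entropy of a pmf in bits."""
    return -sum(p * log2(p) for p in pmf if p > 0)


for name, r, B in [("STBR", (3, 3, 3, 3), (6, 6, 6)),
                   ("GTBR", (6, 3, 3, 0), (6, 6, 6))]:
    pmf = stage0_length_pmf(r, B)
    print(name, "support:", len(pmf),
          "length entropy (bits):", round(entropy_bits(pmf), 3))
```

For these two regulators the first-slot support grows from 4 to 7 packet lengths, which is the "larger support" effect discussed above.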

We argue that $\mathbf{B}^*$ must be uniform or near-uniform for
maximum information utility. If $\mathbf{B}^*$ is neither uniform nor
near-uniform, then $B_j^* = \min_i B_i^*$ is much smaller than $B$.
This restricts the range of values taken by $u_{j+1}$ and $\ell_{j+1}$
(from (3) and (2)). The support of packet length pmfs at stage
$j + 1$ is reduced, leading to lower flow entropy at stage $j + 1$
and consequently lower information utility. Thus, $\mathbf{B}^*$ must be
uniform or near-uniform to maximize the minimum support of
the packet length pmfs at each stage. Also, in Table I observe
that $\min_i B_i^* = B - 1$ or $\min_i B_i^* = B$ throughout.

We now argue that for maximum information utility, $\mathbf{r}^*$ must
be a decreasing sequence, subject to $r_i \le B_i$ for every $i$.
If $r_i > B_i$ for any $i$, then a zero length packet cannot be
transmitted in slot $i$ (from (3)) and will have zero probability.
This decreases the support of the packet length pmfs in slot $i$
and leads to lower information utility. Importantly, from (7),

$$H_0^*(0) = \sum_{\ell_0=0}^{r_0} p_{\ell_0}^*(0)\Big(\ell_0 - \log_2\big(p_{\ell_0}^*(0)\big) + H_1^*\big(\min(r_0 - \ell_0,\, B_0)\big)\Big)$$

The major contribution to information utility $H_0^*(0)$ is from
the support of the packet lengths $[0, r_0]$ and the pmf of the
packet lengths $\mathbf{p}_0^*(0)$, while the contribution from $H_1^*$ is
insignificant. So, to maximize $H_0^*(0)$, $r_0$ should be allowed
to take its maximum possible value, subject to $r_0 \le B_0$,
and the pmf of the packet lengths should be close to the
uniform pmf. The observation that $r_0^* = B_0^*$ consistently in
Table I corroborates this. Also, a high value of $r_0$ leads to
larger supports of packet length pmfs at intermediate and later
stages. Similarly, the first few elements of $\mathbf{r}^*$ tend to take large
values till the aggregate tokens are exhausted. However, their
contribution to $H_0^*(0)$ is not as pronounced and equality may
not hold in $r_i \le B_i$. Thus, $\mathbf{r}^*$ must be a decreasing sequence
and the first few elements of $\mathbf{r}^*$ tend to take their maximum
possible values, subject to $r_i \le B_i$, to achieve uniformity and
larger supports of packet length pmfs at intermediate and later
stages.

This "greedy" nature of $\mathbf{r}^*$ is evident when $N$ and $r$
are kept constant and $B$ increases (Result 4b). A similar
argument is applicable when $N$ and $B$ are kept constant
and $r$ increases (Result 5). The only difference is that a unit
increase in $r$ will necessarily increase $H^*$ by at least $N$ bits
($N$ bits are contributed by the packet contents alone, which
also explains the dominant linear variation in Figure 3), while
a unit increase in $B$ will increase $H^*$ only by an amount
equal to the difference in covert information. The increase
in covert information is positive only if the resulting optimal
token increment and bucket depth sequences $(\mathbf{r}^*, \mathbf{B}^*)$ result
in larger support and more uniformity for the packet length
pmfs. Indeed, when $B$ increases beyond the maximum number
of tokens possible at any stage ($\max_i\{\phi_i\}$), clamping the
residual number of tokens at every stage becomes ineffective
and the system behaves as if bucket depth constraints were
not imposed at all (Figure 2).

^5 The first-order differences form a decreasing, non-negative sequence.

Fig. 2. $H^*$ vs. $B$ is concave ($N = 4$, $r = 3$; horizontal axis: time-averaged bucket depth $B$).

Fig. 3. $H^*$ vs. $r$ is highly linear ($N = 4$, $B = 15$; horizontal axis: time-averaged token rate $r$).

VII. Discussion



In this paper, we have considered a problem where a source
whose traffic is regulated by a generalized token bucket
regulator seeks to maximize the entropy of the resulting flow. The
source can achieve this by recognizing that the randomness
in packet lengths acts as a covert channel in the network
and sizing its packets appropriately. We have formulated the
problem of computing the GTBR with maximum information
utility in terms of constrained token increment and bucket
depth sequences. A GTBR can achieve higher information
utility than a standard IETF token bucket regulator. Finally,
we have given an information-theoretic interpretation of the
observation that an entropy-maximizing GTBR always has a near-uniform
bucket depth sequence and a decreasing token increment
sequence.

Our results show the existence of upper bounds on the
entropy of regulated flows. It would be interesting to construct
source codes which come close to this bound. The development
of a rate-distortion framework for a generalized token
bucket regulator is currently under investigation.